31 research outputs found

Weighted locality-sensitive scheduling for mitigating noise on multi-core clusters

Authors: Bhatele A, Gropp W D, Kale V
Publication venue: Lawrence Livermore National Laboratory
Publication date: 04/08/2011

UNT Digital Library

Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes

Authors: Baboulin M, Bader M, Bhatele A, Brito Gadeschi G, Hammond J, Kruse C, Weinzierl T, Wille M
Publication venue: Springer Verlag
Publication date: 10/05/2023

We identify and show how to overcome an OpenMP bottleneck in the administration of GPU memory. It arises for a wave equation solver on dynamically adaptive block-structured Cartesian meshes, which keeps all CPU threads busy and allows all of them to offload sets of patches to the GPU. Our studies show that multithreaded, concurrent, non-deterministic access to the GPU leads to performance breakdowns, since the GPU memory bookkeeping as offered through OpenMP’s map clause, i.e., the allocation and freeing, becomes another runtime challenge besides expensive data transfer and actual computation. We therefore propose to retain the memory management responsibility on the host: a caching mechanism acquires memory on the accelerator for all CPU threads, keeps hold of this memory, and hands it out to the offloading threads upon demand. We show that this user-managed, CPU-based memory administration helps us to overcome the GPU memory bookkeeping bottleneck and speeds up the time-to-solution of Finite Volume kernels by more than an order of magnitude.

Durham Research Online

Predicting application performance using supervised learning on communication features

Authors: Bhatele A, Gamblin T, Jain N, Kale L V, Robson M P
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2013

Abstract not provided

Crossref

UNT Digital Library

Smith College: Smith ScholarWorks

Simulation-Based Performance Analysis and Tuning for a Two-Level Directly Connected System

Authors: Bhatele A, Bohm E J, Jain N, Kale L V, Mendes C L, Mokos R M, Totoni E, Zheng G
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 19/09/2011

Hardware and software co-design is becoming increasingly important due to complexities in supercomputing architectures. Simulating applications before there is access to the real hardware can assist machine architects in making better design decisions that can optimize application performance. At the same time, the application and runtime can be optimized and tuned beforehand. BigSim is a simulation-based performance prediction framework designed for these purposes. It can be used to perform packet-level network simulations of parallel applications using existing parallel machines. In this paper, we demonstrate the utility of BigSim in analyzing and optimizing parallel application performance for future systems based on the PERCS network. We present simulation studies using benchmarks and real applications expected to run on future supercomputers. Future petascale systems will have more than 100,000 cores, and we present simulations at that scale.

Crossref

UNT Digital Library

Exploring Traditional and Emerging Parallel Programming Models using a Proxy Application

Authors: Bhatele A, Cohen J, DeVito Z, Haque R, Karlin I, Keasler J, Laney D, Luke E, Richards D, Schulz M, Still C H, Wang F
Publication venue: Lawrence Livermore National Laboratory
Publication date: 27/09/2012

UNT Digital Library

Performance Characterization of LLNL HPC Codes

Author: Bhatele A
Publication venue: Office of Scientific and Technical Information (OSTI)
Publication date: 03/05/2013

UNT Digital Library

Performance Characterization and Validation of mocfe_bone

Authors: Bhatele A, Schulz M
Publication venue: Office of Scientific and Technical Information (OSTI)
Publication date: 22/10/2012

UNT Digital Library

Heuristic-Based Techniques for Mapping Irregular Communication Graphs to Mesh Topologies

Authors: Bhatele A, Kale L V
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 22/07/2011